Abstract

Let $X = (x_0, \ldots, x_{n-1})$ be a sequence of $n$ numbers. For $\epsilon > 0$, we say that $x_i$ is an $\epsilon$-approximate median if the number of elements strictly less than $x_i$, and the number of elements strictly greater than $x_i$, are each less than $(1+\epsilon)n/2$. We consider the quantum query complexity of computing an $\epsilon$-approximate median, given the sequence $X$ as an oracle. We prove a lower bound of $\Omega(\min\{1/\epsilon, n\})$ queries for any quantum algorithm that computes an $\epsilon$-approximate median with any constant probability greater than $1/2$. We also show how an $\epsilon$-approximate median may be computed with $O(\frac{1}{\epsilon}\log(\frac{1}{\epsilon})\log\log(\frac{1}{\epsilon}))$ oracle queries, which represents an improvement over an earlier algorithm due to Grover [11, 12]. Thus, the lower bound we obtain is essentially optimal. The upper and the lower bound both hold in the comparison tree model as well.

Our lower bound result is an application of the polynomial paradigm recently introduced to quantum complexity theory by Beals et al. [1]. The main ingredient in the proof is a polynomial degree lower bound for real multilinear polynomials that "approximate" symmetric partial boolean functions. The degree bound extends a result of Paturi [15] and also immediately yields lower bounds for the problems of approximating the $k$th-smallest element, approximating the mean of a sequence of numbers, and that of approximately counting the number of ones of a boolean function. All bounds obtained come within polylogarithmic factors of the optimal (as we show by presenting algorithms where no such optimal or near optimal algorithms were known), thus demonstrating the power of the polynomial method.



1 Introduction 



1.1 Synopsis 

Proving non-trivial lower bounds for any universal model of computation is a formidable task, and quantum computers are no exception. It is thus natural to seek bounds in restricted settings. The first such step in the field of quantum computation was taken by Bennett et al. They prove that we cannot solve NP-complete problems in sub-exponential time on a quantum computer merely by adopting the brute-force strategy of "guessing" solutions and checking them for correctness. Nonetheless, Grover's search algorithm [10] shows that a quadratic speed-up over classical algorithms is possible in this case.

Thus, while the parallelism and the potential for interference inherent in quantum computation are not sufficient to significantly speed up certain strategies for solving problems, they do give some advantage over probabilistic computation. These results motivate the question of whether a similar speed-up is possible in other scenarios as well.

Strategies such as "brute-force search" may formally be modelled via "black-box" computations, in which information about the input is supplied to the algorithm by an oracle. For example, the black-box search problem may be defined as follows: given oracle access to $n$ bits $X = (x_0, \ldots, x_{n-1})$, compute an index $i$ such that $x_i = 1$, if such an index exists. A simpler formulation would require a yes/no answer according to whether such an index exists or not. This amounts to computing the logical OR of the input bits. In the black-box setting, strategies are evaluated by studying the query complexity of the problem, i.e., the minimum, over all algorithms, of the number of times the oracle is accessed (in the worst case) to solve the problem. In the case of the abstract search problem, the query complexity is the number of bits that need to be examined (in the worst case) in order to compute the logical OR of the $n$ bits.
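As a purely classical illustration of this accounting, the following sketch wraps the input bits in an object that can only be read through counted queries. The class name `CountingOracle` and the scanning strategy `classical_or` are hypothetical names introduced here for illustration; they model the deterministic classical setting only, not a quantum oracle.

```python
class CountingOracle:
    """Black-box access to bits x_0..x_{n-1}; every lookup is counted."""

    def __init__(self, bits):
        self._bits = list(bits)
        self.queries = 0

    def __len__(self):
        return len(self._bits)

    def query(self, i):
        # Each call models one oracle access.
        self.queries += 1
        return self._bits[i]


def classical_or(oracle):
    """Compute the logical OR by scanning until the first 1 is seen."""
    for i in range(len(oracle)):
        if oracle.query(i) == 1:
            return 1
    return 0
```

On the all-zero input this strategy must read every bit, which is the worst case referred to in the definition of query complexity.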

Considerable success has been achieved in the study of the query complexity of computing boolean functions in the quantum black-box model, both in terms of optimal lower bounds for specific functions [9] and in terms of general techniques for proving such lower bounds. However, few approaches were known for the study of more general functions. Consider, for example, the problem of approximating the median of $n$ numbers. An $\epsilon$-approximate median of a sequence $X = (x_0, \ldots, x_{n-1})$ of $n$ numbers is a number $x_i$ such that the number of $x_j$ less than it, and the number of $x_j$ more than it, are both less than $(1+\epsilon)n/2$. The problem then is to compute such an $x_i$, given, as an oracle, the sequence $X$ of input values, and an explicitly specified parameter $\epsilon > 0$ (which may be assumed to be at least $1/(2n)$). Grover gave an algorithm for finding an $\epsilon$-approximate median that makes $\tilde{O}(1/\epsilon)$ queries to the input oracle [11, 12]. (Here, the $\tilde{O}$ notation suppresses logarithmic factors and factors depending on $M$, where $M$ is the size of the domain the numbers are picked from.) Thus, an almost quadratic speed-up over the best classical algorithm was achieved (assuming $M$ to be constant). However, it was still open whether this algorithm could be improved upon. In particular, known techniques such as the "hybrid argument" yielded only a weaker lower bound for the number of queries [18], whereas $O(1/\epsilon)$ was suspected to be optimal. In this paper, we prove a lower bound of $\Omega(1/\epsilon)$ for the query complexity of the approximate median problem, thus showing that Grover's algorithm is almost optimal. We also present a new $O(\frac{1}{\epsilon}\log(\frac{1}{\epsilon})\log\log(\frac{1}{\epsilon}))$ query algorithm for the problem, thereby eliminating the dependence of the upper bound on $M$. The upper and the lower bound both also hold in the comparison tree model, in which one is interested in the number of comparisons between the input elements required to compute an $\epsilon$-approximate median.

Our lower bound is derived via the polynomial method recently introduced to the area of quantum computing by Beals et al. [1]. They show that the acceptance probability of a quantum algorithm making $T$ queries to a boolean oracle can be expressed as a real multilinear polynomial of degree at most $2T$ in the oracle input. Thus, if the algorithm computes a boolean function of the oracle input with probability at least $2/3$, the polynomial approximates the function to within $1/3$ at all points in the boolean hypercube. So, by proving a lower bound on the degree of polynomials approximating the boolean function, we can derive a lower bound on the number of queries $T$ the quantum algorithm makes. We cannot, however, follow this particular route for the problem of approximating the median, since the restriction of the problem to boolean inputs does not yield a well-defined function. Nonetheless, the restriction does yield a partial boolean function, i.e., a function that is not defined at all points of the domain. Our result is thus based on a degree lower bound for polynomials that "approximate" partial boolean functions. This degree lower bound generalizes a bound due to Paturi [15] and also gives lower bounds for the problems of approximating the $k$th-smallest element, approximating the mean of a sequence of numbers, and that of approximately counting the number of ones of a boolean function. All bounds obtained are almost tight (as we show by presenting algorithms where no such optimal or near optimal algorithms were known), demonstrating the power of the polynomial method.



1.2 Summary of results 

Consider a partial boolean function $f : \{0,1\}^n \to \{0,1\}$. We say a real $n$-variate polynomial $p$ approximates the partial function $f$ to within $c$, for a constant $0 < c < 1/2$, if

1. for all $X \in \{0,1\}^n$, $p(X) \in [-c, 1+c]$, and

2. for all points $X$ at which $f$ is defined, $|p(X) - f(X)| \le c$.

Our main theorem gives a degree lower bound for polynomials approximating partial boolean functions of the following type. For $X = (x_0, \ldots, x_{n-1}) \in \{0,1\}^n$, let $|X| = \sum_{i=0}^{n-1} x_i$ be the number of ones in $X$. Further, let $\ell, \ell'$ be integers such that $0 \le \ell \ne \ell' \le n$. Define the partial boolean function $f_{\ell,\ell'}$ on $\{0,1\}^n$ as

$$f_{\ell,\ell'}(X) = \begin{cases} 1 & \text{if } |X| = \ell, \\ 0 & \text{if } |X| = \ell'. \end{cases}$$

Let $m \in \{\ell, \ell'\}$ be such that $|n/2 - m|$ is maximized, and let $\Delta_\ell = |\ell - \ell'|$.



Theorem 1.1 Let $p$ be any real $n$-variate polynomial which approximates the partial boolean function $f_{\ell,\ell'}$ to within $c$, for some constant $c < 1/2$. Then, the degree of $p$ is $\Omega(\sqrt{n/\Delta_\ell} + \sqrt{m(n-m)}/\Delta_\ell)$.



This theorem subsumes a degree lower bound given by Paturi [15] for polynomials approximating (total) 
symmetric boolean functions. 
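To make the quantities in Theorem 1.1 concrete, the following sketch evaluates the partial function and the expression inside the $\Omega(\cdot)$. The function names are hypothetical, and the returned number is only the expression from the theorem statement; the constant hidden by the $\Omega$ notation is not modelled.

```python
import math


def f_partial(bits, l, l_prime):
    """Return 1 if |X| = l, 0 if |X| = l', and None where f is undefined."""
    weight = sum(bits)
    if weight == l:
        return 1
    if weight == l_prime:
        return 0
    return None


def degree_bound_expression(n, l, l_prime):
    """sqrt(n / Delta) + sqrt(m (n - m)) / Delta, with m farthest from n/2."""
    delta = abs(l - l_prime)
    # On a tie either choice of m gives the same product m * (n - m).
    m = max((l, l_prime), key=lambda v: abs(n - 2 * v))
    return math.sqrt(n / delta) + math.sqrt(m * (n - m)) / delta
```

For instance, `degree_bound_expression(16, 1, 0)` corresponds to distinguishing inputs of weight 1 from the all-zero input, and evaluates to $\sqrt{16} = 4$.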

We say that an algorithm $A$, possibly with access to an oracle, computes a partial function $f$ on $\{0,1\}^n$ if $\Pr[A(X) \ne f(X)] \le \delta$ for all inputs $X$ for which $f$ is defined, where $\delta$ is some constant less than $1/2$. For boolean $f$, we say that the algorithm accepts an input $X$ if $A(X) = 1$. Theorem 1.1, when combined with a characterization due to Beals et al. (Lemma 4.2 of [1]) of the probability of acceptance of a quantum algorithm on a boolean input oracle, in terms of polynomials, gives us the following result.

Corollary 1.2 Any quantum black-box algorithm that computes the partial boolean function $f_{\ell,\ell'}$, given the input as an oracle, makes $\Omega(\sqrt{n/\Delta_\ell} + \sqrt{m(n-m)}/\Delta_\ell)$ queries.

This lower bound also holds for the expected query complexity of computing the partial function $f_{\ell,\ell'}$. Using an approximate counting algorithm of Brassard et al. [14, 6], we show that our query lower bound is optimal to within a constant factor.

Theorem 1.3 The quantum query complexity of computing the partial function $f_{\ell,\ell'}$, given the input as an oracle, is $O(\sqrt{n/\Delta_\ell} + \sqrt{m(n-m)}/\Delta_\ell)$.

The result of Beals et al. mentioned above then immediately implies that the degree lower bound of Theorem 1.1 is also optimal to within a constant factor.



Corollary 1.4 For any constant $0 < c < 1/2$, there is a real, $n$-variate polynomial $p$ of degree $O(\sqrt{n/\Delta_\ell} + \sqrt{m(n-m)}/\Delta_\ell)$ that approximates the function $f_{\ell,\ell'}$ to within $c$.



Corollary 1.2 enables us to prove lower bounds for the query complexity of computing the statistics listed below, given, as an oracle, a list $X = (x_0, \ldots, x_{n-1})$ of (rational) numbers in the range $[0,1]$ and an explicitly specified real parameter $\epsilon > 0$ or $\Delta > 0$. We may assume $\epsilon$ to be in the range $[1/(2n), 1)$, and $\Delta$ to be in $[1/2, n)$.

1. $\epsilon$-approximate median. A number $x_i$ such that $|\{j : x_j < x_i\}| < (1+\epsilon)n/2$ and $|\{j : x_j > x_i\}| < (1+\epsilon)n/2$.

2. $\Delta$-approximate $k$th-smallest element. (Defined for $1 \le k \le n$.) A number $x_i$ that is a $j$th-smallest element of $X$ for some $j$ in the range $(k - \Delta, k + \Delta)$.

3. $\epsilon$-approximate mean. A number $\mu$ such that $|\mu - \mu_X| < \epsilon$, where $\mu_X = \frac{1}{n}\sum_{i=0}^{n-1} x_i$ is the mean of the $n$ input numbers.

4. $\Delta$-approximate count. (Defined when $x_i \in \{0,1\}$ for all $i$.) A number $t$ such that $|t - t_X| < \Delta$, where $t_X = |X| = \sum_{i=0}^{n-1} x_i$ is the number of ones in $X$.

5. $\epsilon$-approximate relative count. (Defined when $x_i \in \{0,1\}$ for all $i$.) A number $t$ such that $|t - t_X| < \epsilon t_X$, where $t_X$ is defined as above.

Note that some of the problems defined above are very closely related to each other. Problem 2 is a natural 
generalization of problem 1; problem 4 is, of course, the restriction of problem 3 to boolean inputs (with A 
defined appropriately), and problem 5 is a version of problem 4 where we are interested in bounding relative 
error rather than additive error. In the case of problems 1 and 2, we may relax the condition that the 
approximate statistic be a number from the input list (with a suitable modification to definition 2 above); 
our results continue to hold with the relaxed definitions. (Problem 1 was first studied by Grover [11, 12] 
with this relaxed definition.) 
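The definitions of problems 1 to 4 can be written as simple classical checkers, which may help in reading the reductions later. This is only a sketch: the function names are hypothetical, and the treatment of ties in the $k$th-smallest checker (an element with equal copies is taken to occupy every rank in its block) is an assumption made here, not something the definitions above spell out.

```python
def is_eps_approx_median(xs, i, eps):
    """Problem 1: fewer than (1 + eps) n / 2 elements on each side of xs[i]."""
    n = len(xs)
    less = sum(1 for x in xs if x < xs[i])
    greater = sum(1 for x in xs if x > xs[i])
    bound = (1 + eps) * n / 2
    return less < bound and greater < bound


def is_delta_approx_kth(xs, i, k, delta):
    """Problem 2: xs[i] is a j-th smallest element for some j in (k - delta, k + delta)."""
    less = sum(1 for x in xs if x < xs[i])
    equal = sum(1 for x in xs if x == xs[i])
    # Assumed tie rule: xs[i] may take any 1-indexed rank in less+1 .. less+equal.
    return any(k - delta < j < k + delta for j in range(less + 1, less + equal + 1))


def is_eps_approx_mean(xs, mu, eps):
    """Problem 3: mu is within eps of the true mean."""
    return abs(mu - sum(xs) / len(xs)) < eps


def is_delta_approx_count(bits, t, delta):
    """Problem 4: t is within delta of the number of ones."""
    return abs(t - sum(bits)) < delta
```

For the list `[0.1, 0.5, 0.9, 0.3, 0.7]`, the element `0.5` passes the median check with `eps = 0.1`, while `0.1` does not, since four elements exceed it.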

We first prove a lower bound for approximating the $k$th-smallest element by showing reductions from partial functions of the sort described above.

Theorem 1.5 At least $\Omega(\sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta)$ oracle queries are made by any quantum black-box algorithm for computing a $\Delta$-approximate $k$th-smallest element.

We thus get a lower bound for the approximate median problem as well. 

Corollary 1.6 The quantum query complexity of computing an $\epsilon$-approximate median is $\Omega(1/\epsilon)$.

We also propose an algorithm for approximating the $k$th-smallest element that comes within a polylogarithmic factor of the optimum.

Theorem 1.7 Let $N = \sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta$. There is a quantum black-box algorithm that computes a $\Delta$-approximate $k$th-smallest element of $n$ numbers given via an oracle, with $O(N \log(N) \log\log(N))$ queries.
This gives us a new algorithm for estimating the median. Our algorithm represents an improvement over the algorithm of Grover [11, 12] when the input numbers are allowed to be drawn from an arbitrarily large domain.

Corollary 1.8 $O(\frac{1}{\epsilon}\log(\frac{1}{\epsilon})\log\log(\frac{1}{\epsilon}))$ queries are sufficient for computing an $\epsilon$-approximate median in the black-box model.

This gives us an almost quadratic speed up over classical algorithms in the worst case. 

A very natural measure of the complexity of computing functions such as the $k$th-smallest element of a given list of numbers is the number of comparisons between the input elements required for the computation. To study this aspect of such problems, one considers algorithms in the comparison tree model. In this model, the algorithm is provided with an oracle that returns the result of the comparison $x_i < x_j$ when given a pair of indices $(i, j)$, rather than an oracle that returns the number $x_i$ on a query $i$, where the $x_i$'s are understood to be the input numbers. The query complexity of a problem such as computing the minimum or the median then exactly corresponds to the number of comparisons required to solve the problem. The lower and the upper bounds given above for estimating the $k$th-smallest element and the median continue to hold in the comparison tree model. In particular, if we set $\Delta = 1$, we get an almost optimal $\tilde{O}(\sqrt{k(n-k+1)})$ comparison algorithm for computing the $k$th-smallest element (cf. Theorems 1.5 and 1.7). (An optimal $O(\sqrt{n})$ comparison algorithm was already known for computing the minimum of $n$ numbers.) This should be contrasted with the bound of $\Theta(n)$ in the classical case.



Corollary 1.9 Let $N = \sqrt{k(n-k+1)}$. Any comparison tree quantum algorithm that computes the $k$th-smallest element of a list of $n$ numbers makes $\Omega(N)$ comparisons. Moreover, there is a quantum algorithm that solves this problem with $O(N \log(N) \log\log(N))$ comparisons.



Another application of Corollary 1.2 is to the problem of approximating the mean. Grover [12] recently gave an $O(\frac{1}{\epsilon}\log\log\frac{1}{\epsilon})$ query algorithm for this problem, which is again an almost quadratic improvement over classical algorithms. When the inputs are restricted to be 0/1, the problem reduces to the counting problem. Using the approximate counting algorithm of Brassard et al. mentioned above, we show that the computation of the mean can be made sensitive to the number of ones $t$ in the input, thus getting better bounds when $|t - n/2|$ is large.

Theorem 1.10 There is a quantum black-box algorithm that, given a boolean oracle input $X$, and an integer $\Delta > 0$, computes a $\Delta$-approximate count and makes an expected $O(\sqrt{n/\Delta} + \sqrt{t(n-t)}/\Delta)$ number of queries on inputs with $t$ ones.

We show that this algorithm is optimal to within a constant factor, and, in the process, get an almost tight 
lower bound for the general mean estimation problem. 

Theorem 1.11 Any quantum black-box algorithm that approximates the number of ones of a boolean oracle to within an additive error of $\Delta$ makes $\Omega(\sqrt{n/\Delta} + \sqrt{t(n-t)}/\Delta)$ queries on inputs with $t$ ones.



Corollary 1.12 The quantum query complexity of the $\epsilon$-approximate mean problem is $\Omega(1/\epsilon)$.

Brassard et al. [14, 6] study the version of the approximate counting problem in which one is interested in bounding the relative error of the estimate. We show that their algorithm is optimal to within a constant factor (when $t < (1-\epsilon)n$).
Theorem 1.13 Any quantum black-box algorithm that solves the $\epsilon$-approximate relative count problem
makes



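$$\Omega\left(\sqrt{\frac{n}{\lceil \epsilon(t+1) \rceil}} + \frac{\sqrt{t(n-t)}}{\lceil \epsilon(t+1) \rceil}\right)$$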



queries on inputs with t ones. 



Finally, we would like to point out that in view of Corollary 1.4, the lower bounds stated above cannot be improved using the method we employ in this paper. In fact, we believe that the lower bounds are optimal, and that the upper bounds can be improved to match them (up to constant factors).

2 The lower bound theorem and its applications 

This section is devoted to deriving a polynomial degree lower bound, and to showing how lower bounds for the query complexity of the different black-box problems defined in Section 1.2 follow from it. We first prove the degree lower bound for polynomials in Section 2.1, and then apply the result to quantum black-box computation in Section 2.2.



2.1 A degree lower bound for polynomials 



We now prove our main result, Theorem 1.1, which gives a lower bound for polynomials approximating symmetric partial functions. The bound is derived using a technique employed by Paturi [15] for polynomials that approximate non-constant symmetric boolean functions. Our bound generalizes and subsumes the Paturi bound.

We refer the reader to Appendix A for the definition of the concepts involved in the proof. Appendix A also summarizes the various facts about polynomials that we use to derive the bound.



Our proof rests heavily on the inequalities of Bernstein and Markov (Facts A.6 and A.5). The essence of these inequalities is that if there is a point in $[-1, 1]$ at which a polynomial has a "large" derivative, and if the point is suitably close to the middle of the interval, the polynomial has "high" degree.
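The flavour of these inequalities can be checked numerically on the Chebyshev polynomial $T_d$, which has norm 1 on $[-1, 1]$. The sketch below assumes the classical forms of the two inequalities for a degree-$d$ polynomial $p$ on $[-1,1]$, namely $\|p'\| \le d^2 \|p\|$ (Markov) and $\sqrt{1-x^2}\,|p'(x)| \le d \|p\|$ (Bernstein); the exact statements used in the proof are those of Appendix A, which is not reproduced here. The helper names are hypothetical.

```python
import math


def cheb_T_prime(d, x):
    """Derivative of the Chebyshev polynomial T_d, via T_d' = d * U_{d-1}."""
    if d == 0:
        return 0.0
    u_prev, u = 0.0, 1.0  # U_{-1} = 0, U_0 = 1
    for _ in range(d - 1):
        u_prev, u = u, 2 * x * u - u_prev  # U_{k+1} = 2x U_k - U_{k-1}
    return d * u


def max_ratios(d, samples=2001):
    """Return (max |T_d'| / d^2, max sqrt(1 - x^2) |T_d'| / d) on a grid over [-1, 1]."""
    markov = bernstein = 0.0
    for j in range(samples):
        x = -1 + 2 * j / (samples - 1)
        slope = abs(cheb_T_prime(d, x))
        markov = max(markov, slope / d**2)
        bernstein = max(bernstein, math.sqrt(max(0.0, 1 - x * x)) * slope / d)
    return markov, bernstein
```

The first ratio reaches 1 only at the endpoints $x = \pm 1$, while the second stays at most 1 throughout: a derivative of order $d^2$ is possible only near the ends of the interval, and in the middle it is limited to order $d$. This is the sensitivity to the location of the point that the proof of Lemma 2.2 has to work around.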



Proof of Theorem 1.1: Recall from Section 1.2 that $f_{\ell,\ell'}(X)$ is a partial boolean function on $\{0,1\}^n$ which is 1 when $|X| = \ell$ and 0 when $|X| = \ell'$, that $m$ is one of the integers $\ell, \ell'$ such that $|n/2 - m|$ is maximized, and that $\Delta_\ell = |\ell - \ell'|$. We assume that $p$ is an $n$-variate polynomial of degree $d$ which approximates the partial function $f_{\ell,\ell'}$ to within $1/3$ in the sense defined in Section 1.2. The constant $1/3$ may be replaced by any constant less than $1/2$; the proof continues to hold for that case. Without loss of generality, we assume that $\ell > \ell'$ (we work with the polynomial $1 - p$, which approximates $1 - f_{\ell,\ell'}$, if $\ell < \ell'$).



We begin by replacing $p$ with its symmetrization $p^{\mathrm{sym}}$ and then using Fact A.1 to transform it to an equivalent univariate polynomial $q$. (Since $x^2 = x$ for $x \in \{0,1\}$, we may assume that $p$ is multilinear.) We show a degree lower bound for $q$, thus giving a degree lower bound for $p$.
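The symmetrization step can be illustrated by brute force on a small example. The sketch below assumes that Fact A.1 is the standard symmetrization statement (averaging a multilinear polynomial over all permutations of its variables yields a function of $|X|$ alone, given by a univariate polynomial of no larger degree); the appendix itself is not reproduced here, and the representation of polynomials as dictionaries is a choice made for this example.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb


def evaluate(poly, x):
    """Evaluate a multilinear polynomial {frozenset(indices): coeff} at a 0/1 point."""
    return sum((Fraction(c) for S, c in poly.items() if all(x[i] for i in S)),
               Fraction(0))


def symmetrized(poly, x):
    """Average of p over all permutations of the coordinates of x (brute force)."""
    n = len(x)
    values = [evaluate(poly, [x[pi[i]] for i in range(n)])
              for pi in permutations(range(n))]
    return sum(values, Fraction(0)) / len(values)


def univariate(poly, n, k):
    """A monomial on s variables averages to C(k, s) / C(n, s) at Hamming weight k."""
    return sum((Fraction(c) * Fraction(comb(k, len(S)), comb(n, len(S)))
                for S, c in poly.items()), Fraction(0))


def agrees_everywhere(poly, n):
    """Check that the brute-force average depends only on the weight of x."""
    return all(symmetrized(poly, x) == univariate(poly, n, sum(x))
               for x in product((0, 1), repeat=n))
```

Since $\binom{k}{s}$ is a polynomial of degree $s$ in $k$, the univariate form has degree at most that of the original polynomial, which is why a degree lower bound for $q$ transfers to $p$.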

In order to apply the derivative inequalities above, we scale to transform the polynomial $q$ to an equivalent polynomial $\hat{q}$ over the interval $[-1, 1]$, where $\hat{q}(x) = q((1+x)n/2)$. For $i = 0, 1, \ldots, n$, let $a_i = 2i/n - 1$. Clearly, $\hat{q}$ has the following properties:

1. $\hat{q}$ has degree at most $d$.

2. $|\hat{q}(a_i)| \le 4/3$ for $0 \le i \le n$.

3. $\hat{q}(a_\ell) \ge 2/3$ and $\hat{q}(a_{\ell'}) \le 1/3$. Thus, by the Mean Value Theorem, there is a point $a$ in the interval $[a_{\ell'}, a_\ell]$ such that $\hat{q}'(a) \ge (2/3 - 1/3)/(a_\ell - a_{\ell'}) = n/(6\Delta_\ell)$.
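The arithmetic in property 3 is easy to confirm with exact rationals. The helper names below are hypothetical; the sketch only recomputes the grid points $a_i = 2i/n - 1$ and the slope forced by a rise of $1/3$ between $a_{\ell'}$ and $a_\ell$.

```python
from fractions import Fraction


def grid_point(i, n):
    """The image a_i = 2i/n - 1 of the integer i under the scaling to [-1, 1]."""
    return Fraction(2 * i, n) - 1


def mean_value_slope(n, l, l_prime):
    """Average slope when the value rises from 1/3 at a_{l'} to 2/3 at a_l."""
    rise = Fraction(2, 3) - Fraction(1, 3)
    return rise / (grid_point(l, n) - grid_point(l_prime, n))
```

For example, with $n = 12$, $\ell = 7$ and $\ell' = 4$ the two grid points are $1/2$ apart and the slope is $2/3$, which equals $n/(6\Delta_\ell)$ with $\Delta_\ell = 3$.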

We prove two lower bounds for $d$, which together imply the theorem. The first of the lower bounds follows by applying the Markov Inequality (Fact A.5.1) directly to $\hat{q}$.

Lemma 2.1 $d = \Omega(\sqrt{n/\Delta_\ell})$.



Proof: We consider two cases:

Case (a). $\|\hat{q}\| < 2$. Combining property 3 of $\hat{q}$ listed above and Fact A.5.1, we get

$$d^2 \ge \hat{q}'(a)/\|\hat{q}\| \ge n/(12\Delta_\ell).$$

So $d = \Omega(\sqrt{n/\Delta_\ell})$.

Case (b). $\|\hat{q}\| \ge 2$. From property 2 of $\hat{q}$ listed above, every point at which $\hat{q}$ attains its norm is no more than $2/n$ away from a point at which $|\hat{q}(x)| \le 4/3$. Hence, by the Mean Value Theorem, there is a point $\hat{a} \in [-1, 1]$ such that

$$|\hat{q}'(\hat{a})| \ge (\|\hat{q}\| - 4/3)/(2/n) \ge n\|\hat{q}\|/6.$$

The Markov inequality then implies $d = \Omega(\sqrt{n}) = \Omega(\sqrt{n/\Delta_\ell})$. ■

The second of the lower bounds now follows from an application of the Bernstein Inequality for algebraic and trigonometric polynomials (Facts A.5.2 and A.6, respectively).

Lemma 2.2 $d = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.

Proof: Note that if $\hat{q}$ has norm less than 2, property 3 in conjunction with Fact A.5.2 implies that

$$2d > \|\hat{q}\|\, d \ge \sqrt{1 - a^2}\; \hat{q}'(a) \ge \sqrt{1 - a^2}\; (n/(6\Delta_\ell)).$$

But since $a \in [a_{\ell'}, a_\ell]$, we have

$$1 - a^2 \ge 1 - a_m^2 = 1 - (2m/n - 1)^2 = 4m(n-m)/n^2.$$

So, $d = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.

Now suppose that $\|\hat{q}\| \ge 2$. The proof in this case is not as straightforward as in Case (b) of the proof of Lemma 2.1, since Fact A.5.2 only gives us a bound which is sensitive to the point at which $\hat{q}$ has high derivative. However, it is possible to "damp" the value of the polynomial outside a suitable interval, and thus obtain the required bound.

Let $b$ be a point in $[-1, 1]$ at which $\min_x \{|x| : |\hat{q}(x)| \ge 2\}$ is attained, and let $c$ be one of the numbers $b, a_\ell$ such that $|c|$ is minimized. We assume that $c \ge 0$, since the proof in the other case is similar. Let $C$ be a constant such that $0 < C < 0.01$. We distinguish between two cases.

Case (a). $c \le 1 - C$. We consider a polynomial $r$ defined as:

$$r(x) = \hat{q}(x + c)\,(1 - x^2)^{d_1},$$

where $d_1 = \lceil 6/C^2 \rceil d$. The degree $D$ of $r$ is clearly $O(d)$, so it suffices to prove the claimed lower bound for $D$.

Suppose $\|r\| < 2$. Then, the following property of $r$ gives us the required bound. If $c = a_\ell$, then $r(0) \ge 2/3$, and we also have $r(a_{\ell'} - c) \le 1/3$. If $c = b$, then $|r(0)| \ge 2$, and moreover, there is a point $c' < c$ at a distance at most $2/n$ from $c$ such that $|r(c' - c)| \le 4/3$. In either case, there is a point $a \in [a_{\ell'} - a_\ell, 0]$ such that $|r'(a)| = \Omega(n/\Delta_\ell)$. We may assume, without loss of generality, that $\Delta_\ell \le n/4$, so that $a \in [-1/2, 0]$. (Indeed, since $d \ge 1$, we already have $d = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$ if $\Delta_\ell > n/4$.) We may now invoke the Mean Value Theorem and Fact A.5.2 to conclude that $D = \Omega(n/\Delta_\ell) = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.

We now focus on the case when $\|r\| \ge 2$. We show in Claim 2.3 below that $|r(x)|$ is bounded by 1 for $C \le |x| \le 1$. This implies that $\|r\|$ (which is at least 2) is attained within $[-C, C]$. Note that $r$ is bounded by $4/3$ at the points $a_i - c$, which are separated by at most $2/n$ in $[-C, C]$. So there is a point $a \in [-C, C]$ at which $|r'(a)| \ge n\|r\|/6$. Applying Fact A.5.2 to $r$ at the point $a$, we get $D = \Omega(n) = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.

It only remains to prove the following claim to complete the analysis of Case (a).

Claim 2.3 For all $x \in [-1, -C] \cup [C, 1]$, we have $|r(x)| \le 1$.



Proof: Note that $\|\hat{q}\| = \max_{0 \le x \le n} |q(x)|$. By Fact A.2, we thus have $\|\hat{q}\| \le (4/3) \cdot 2^d$. In particular, $|\hat{q}(x + c)| \le (4/3) \cdot 2^d \le (4/3) \cdot e^{5d}$ for $x \in [-1, 1 - c]$. We give the same bound for $|\hat{q}(x + c)|$ for $x \in [1 - c, 1]$ by using the Chebyshev polynomial bound from Appendix A:

$$|\hat{q}(x + c)| \le \|\hat{q}\| \cdot T_d(x + c) \le (4/3) \cdot 2^d \cdot e^{2\sqrt{3}\,d} \le (4/3) \cdot e^{5d},$$

since $c \le 1$. Further, if $C \le |x| \le 1$, we have $(1 - x^2)^{d_1} \le e^{-x^2 d_1} \le e^{-6d}$. Combining these two inequalities, we may bound $r$ as follows:

$$|r(x)| = |\hat{q}(x + c)|\,(1 - x^2)^{d_1} \le (4/3) \cdot e^{5d} \cdot e^{-6d} < 1$$

for $x$ in the region $[-1, -C] \cup [C, 1]$. ■
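The two exponent estimates used in this damping argument can be sanity-checked numerically. The sketch below works with logarithms to avoid overflow; the function name and the sample values of $C$ and $d$ are choices made for this example, and it assumes the damping exponent $d_1 = \lceil 6/C^2 \rceil d$ and the growth bound $(4/3)e^{5d}$ as read from the proof above.

```python
import math


def damping_exponents(C, d):
    """Return (log of (1 - C^2)^{d1}, log of the combined bound (4/3) e^{5d} e^{-6d})."""
    d1 = math.ceil(6 / C**2) * d
    # log1p(-y) <= -y, so d1 * log1p(-C^2) <= -6d whenever d1 >= 6d / C^2.
    log_damping = d1 * math.log1p(-C * C)
    log_combined = math.log(4 / 3) + 5 * d - 6 * d
    return log_damping, log_combined
```

With $C = 0.005$ and $d = 3$, the damping factor at $|x| = C$ is already below $e^{-18}$, and the combined bound $(4/3)e^{-d}$ is below 1, as the claim requires.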

We now turn to the remaining case. 

Case (b). $c > 1 - C$. Without loss of generality, we assume that $\Delta_\ell \le \ell', \ell \le n - \Delta_\ell$ (otherwise, the bound we seek follows from Lemma 2.1 above, since $\sqrt{m(n-m)}/\Delta_\ell \le \sqrt{n/\Delta_\ell}$). This implies, in particular, that $c < 1$. Let $\alpha_c = \cos^{-1} c$. Since $0.99 < 1 - C < c < 1$, we have $0 < \alpha_c < 1/4$.

We prove a degree lower bound for a trigonometric polynomial $s$ derived from $\hat{q}$. The polynomial $s$ is defined as:

$$s(\theta) = \hat{q}(\cos\theta)\,[\cos(d_1(\theta - \alpha_c))]^{d_2},$$

where $d_1 = \lfloor 1/(2\alpha_c) \rfloor$ and $d_2 = c_1 \lceil d/d_1 \rceil$, for some integer constant $c_1 \ge 1$ to be specified later. Let $D$ be the degree of the polynomial $s$.

Claim 2.4 $D = O(d)$.

Proof: First, note that since $\cos\theta \ge 1 - \theta^2/2$ for $\theta \in [0, \pi/2]$, we have

$$\alpha_c \ge \sqrt{2(1 - \cos\alpha_c)} = \sqrt{2(1 - c)} \ge 2\sqrt{\Delta_\ell/n}.$$

The last inequality follows from the assumption that $\ell \le n - \Delta_\ell$. So $d_1 \le 1/(2\alpha_c) = O(\sqrt{n/\Delta_\ell})$, which is $O(d)$, by Lemma 2.1. We may now bound $D$ as follows:

$$D \le d + d_2 d_1 = d + c_1 \lceil d/d_1 \rceil d_1 \le d + c_1(d + d_1).$$

So $D = O(d)$. ■



Thus, it suffices to prove a lower bound of $\Omega(\sqrt{m(n-m)}/\Delta_\ell)$ for $D$, which we do next. Let $\alpha_i = \cos^{-1} a_i$, for $i = 0, \ldots, n$.

Again, if $\|s\|$ is bounded by 2, we get the lower bound easily: if $c = b$, then $|s(\alpha_c)| \ge 2$, and there is a point $a_i$ at a distance at most $2/n$ to the left of $c$ such that $|s(\alpha_i)| \le 4/3$. We therefore have, for some $\alpha \in [\alpha_c, \alpha_i]$, that $|s'(\alpha)| \ge (2/3)/(\alpha_i - \alpha_c)$. Moreover, by the Mean Value Theorem, we have $\alpha_i - \alpha_c = |\cos\alpha_i - \cos\alpha_c| / \sin\hat{\alpha}$ for some $\hat{\alpha} \in [\alpha_c, \alpha_i]$. Note that

$$\sin\hat{\alpha} \ge \sin\alpha_c \ge \sin\alpha_\ell \ge \sin\alpha_m = \sqrt{1 - a_m^2}.$$

Thus, $|s'(\alpha)| \ge (2/3)\sqrt{1 - a_m^2}/(2/n)$, which gives us $D = \Omega(\sqrt{m(n-m)}) = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$, when combined with Fact A.6, the Bernstein Inequality for trigonometric polynomials. If $c = a_\ell$, we can similarly argue that $D = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.



We now examine the case when $\|s\| \ge 2$. Claim 2.5 below shows that $|s(\theta)|$ is bounded by 1 when $\theta \in [-\pi, -\pi + \alpha_c/2] \cup [-\alpha_c/2, \alpha_c/2] \cup [\pi - \alpha_c/2, \pi]$. We may assume that the point where the norm (which is greater than 2) is attained is in $[0, \pi]$; the proof proceeds in an analogous manner in the other case. This point is then close to some point $\alpha_i \in [\alpha_c/2, \pi - \alpha_c/2]$ where $|s(\alpha_i)| \le 4/3$. Arguing as before, we get that, for some points $\alpha, \beta \in [\alpha_c/2, \pi - \alpha_c/2]$, $|s'(\alpha)| \ge \|s\| (\sin\beta)/(3(2/n))$. Further,

$$\sin\beta \ge \sin\frac{\alpha_c}{2} \ge \frac{\alpha_c}{4} \ge \frac{\sin\alpha_c}{4} \ge \frac{\sin\alpha_m}{4}.$$

From Fact A.6, we now get $D = \Omega(\sqrt{m(n-m)}) = \Omega(\sqrt{m(n-m)}/\Delta_\ell)$.

We now prove that $s$ is bounded in the region mentioned above.

Claim 2.5 For all $\theta \in [-\pi, -\pi + \alpha_c/2] \cup [-\alpha_c/2, \alpha_c/2] \cup [\pi - \alpha_c/2, \pi]$, we have $|s(\theta)| \le 1$.

Proof: We prove the claim for $\theta \in [0, \alpha_c/2]$. The analysis for $\theta$ in the other intervals is similar (one exploits the fact that $\hat{q}(\cos\theta)$ is an even function of $\theta$, and that the corollary to Fact A.3 limits its behaviour outside $[\alpha_c, \pi - \alpha_c]$).

Let $h(\theta) = [\cos(d_1(\theta - \alpha_c))]^{d_2}$. Then, for $\theta \in [0, \alpha_c]$,

$$|h(\alpha_c - \theta)| = |\cos(d_1\theta)|^{d_2} \le (1 - (d_1\theta)^2/4)^{d_2} \le e^{-d_2 (d_1\theta)^2/4} \le e^{-c_1 d\,\theta^2/(16\alpha_c)}.$$

The first inequality follows from the fact that $\cos\phi \le 1 - \phi^2/4$ for $\phi \in [0, \pi/2]$ and that $0 \le d_1\alpha_c \le 1/2$. The second is a consequence of $1 + x \le e^x$. The remaining steps follow from the definitions of $d_1$, $d_2$ and the fact that $\alpha_c < 1/4$.

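Further, Corollary A.4 gives us the following bound on the value of $\hat{q}$ outside the interval $[-c, c]$:

$$|\hat{q}(c + x)| \le 2\,|T_d(1 + x/c)| \le 2 \cdot e^{2d\sqrt{3x/c}}$$

for $x \in [0, 1 - c]$. Since, for $\theta \in [0, \alpha_c]$,

$$\cos(\alpha_c - \theta) = \cos\alpha_c \cos\theta + \sin\alpha_c \sin\theta \le \cos\alpha_c + \alpha_c\theta = c + \alpha_c\theta,$$

we have $|\hat{q}(\cos(\alpha_c - \theta))| \le 2 \cdot e^{2d\sqrt{3\alpha_c\theta/c}} \le 2 \cdot e^{4d\sqrt{\alpha_c\theta}}$. So, for $\theta \in [0, \alpha_c/2]$,

$$|s(\theta)| = |\hat{q}(\cos(\alpha_c - (\alpha_c - \theta)))|\; |h(\alpha_c - (\alpha_c - \theta))| \le 1,$$

provided $c_1$ is chosen large enough (as may readily be verified, bearing in mind that $1/\alpha_c = O(d)$). ■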
This completes the derivation of the second lower bound on the degree $d$ of the polynomial $\hat{q}$. ■

Lemmas 2.1 and 2.2 together imply that $d = \Omega(\max\{\sqrt{n/\Delta_\ell}, \sqrt{m(n-m)}/\Delta_\ell\})$, which is equivalent to the bound stated in Theorem 1.1.

2.2 Applications to quantum black-box computation

In this section, we use our degree lower bound in conjunction with a result of Beals et al. [1] to derive lower bounds for the quantum black-box complexity of approximating the statistics defined in Section 1.2. The key lemma of [1] which we require is the following.

Lemma 2.6 (Beals, Buhrman, Cleve, Mosca, de Wolf) Let $A$ be a quantum algorithm that makes $T$ calls to a boolean oracle $X$. Then, there is a real multilinear polynomial $p(x_0, \ldots, x_{n-1})$ of degree at most $2T$ such that the acceptance probability of $A$ on oracle input $X = (x_0, \ldots, x_{n-1})$ is exactly $p(x_0, \ldots, x_{n-1})$.
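To see what kind of object the lemma produces, it may help to compute the unique multilinear polynomial that agrees with a given function on the hypercube. The sketch below does this by Möbius inversion over subsets; this is a standard construction used here for illustration, not a step of the proof, and the function names are hypothetical.

```python
from itertools import combinations


def multilinear_coefficients(f, n):
    """Coefficients c_S of the multilinear polynomial equal to f on {0,1}^n.

    Uses c_S = sum over T subset of S of (-1)^(|S| - |T|) * f(indicator of T).
    """
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            total = 0
            for tsize in range(size + 1):
                for T in combinations(S, tsize):
                    x = [1 if i in T else 0 for i in range(n)]
                    total += (-1) ** (size - tsize) * f(x)
            if total != 0:
                coeffs[S] = total
    return coeffs


def degree(coeffs):
    """Largest monomial size with a nonzero coefficient."""
    return max((len(S) for S in coeffs), default=0)
```

For the OR of three bits this yields $x_0 + x_1 + x_2 - x_0x_1 - x_0x_2 - x_1x_2 + x_0x_1x_2$, of degree 3. By the lemma, an algorithm whose acceptance probability equals this function exactly would need $2T \ge 3$; the approximating polynomials considered in this paper are allowed to deviate from $f$ and can therefore have lower degree.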



We deduce Corollary 1.2 from Theorem 1.1 using this lemma.

Proof of Corollary 1.2: Consider an oracle quantum algorithm $A$ that computes the partial function $f_{\ell,\ell'}$ with constant error probability $c < 1/2$ by making at most $T$ oracle queries. From the lemma above, we deduce that there is a multilinear polynomial $p(x_0, \ldots, x_{n-1})$ of degree at most $2T$ that gives the acceptance probability of $A$ with the oracle input $X = (x_0, \ldots, x_{n-1})$. Clearly, $p$ approximates $f_{\ell,\ell'}$ to within $c$: $p(X) \ge 1 - c$ when $|X| = \ell$ and $p(X) \le c$ when $|X| = \ell'$, and, moreover, the value of $p(X)$ is restricted to the interval $[0, 1]$ for all $X \in \{0,1\}^n$. Since $2T$ is at least the degree of $p$, Theorem 1.1 now immediately implies the result. ■



In the remainder of this section, we show how to reduce from partial function computations of the type given in Corollary 1.2 to approximating the $k$th-smallest element and to approximate counting, and show how bounds for approximating the median and the mean follow. In this way, we are able to show new quantum query lower bounds for the computation of these approximate statistics.



The following two lemmas specialize Corollary 1.2 to cases of interest to us. The first deals with functions $f_{\ell,\ell'}$ such that neither $\ell'$ nor $\ell$ is "close" to 0 or $n$, and the second covers the remaining case.

Lemma 2.7 Let $k, \Delta > 0$ be integers such that $2\Delta < k < n - 2\Delta$. Then, the quantum query complexity of $f_{k-\Delta,k+\Delta}$ is $\Omega(\sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta)$.



Proof: We assume that $k \le n/2$; the other case is symmetric. In applying Corollary 1.2, $\Delta_\ell = 2\Delta$. Since $k \le n/2$, $m = k - \Delta$. Moreover, $(k - \Delta)(n - k + \Delta) \ge (k/2)(n - k)$. Corollary 1.2 now gives us the claimed bound. ■
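The two facts used in this proof, the choice of $m$ and the product inequality, can be confirmed exhaustively for small parameters. The following brute-force check is only a sanity test over a finite range, under the reading $2\Delta < k \le n/2$ of the hypotheses; the function name is hypothetical.

```python
def lemma_2_7_facts_hold(n):
    """Check m = k - Delta and (k - Delta)(n - k + Delta) >= (k/2)(n - k) for all valid k, Delta."""
    for delta in range(1, n):
        for k in range(2 * delta + 1, n - 2 * delta):
            if 2 * k > n:
                continue  # the proof treats k <= n/2; the other case is symmetric
            l, l_prime = k - delta, k + delta
            # m is the endpoint farthest from n/2 (compare |n - 2v| to stay in integers).
            m = max((l, l_prime), key=lambda v: abs(n - 2 * v))
            if m * (n - m) != l * (n - l):
                return False
            if 2 * l * (n - l) < k * (n - k):
                return False
    return True
```

The inequality holds because $k > 2\Delta$ gives $k - \Delta > k/2$, while $n - k + \Delta > n - k$.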
Lemma 2.8 Let $k, \Delta$ be integers such that $0 < \Delta < n/4$ and $0 < k \le 2\Delta$. Then, the quantum query complexity of $f_{0,k+\Delta}$ is $\Omega(\sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta)$. The same bound holds for $f_{k-\Delta,n}$ if $k \ge n - 2\Delta$.

Proof: We prove the first part of the lemma; the other part follows by symmetry. In applying Corollary 1.2, we have $\Delta_\ell = k + \Delta \le 3\Delta$, and $m = 0$. Hence, we get a bound of $\Omega(\sqrt{n/\Delta})$ for $f_{0,k+\Delta}$. For the lemma to hold, we need only show that the second term in the claimed lower bound is of the order of the first term: $\sqrt{k(n-k)}/\Delta \le \sqrt{(2\Delta)n}/\Delta = O(\sqrt{n/\Delta})$. ■



We now prove the rest of the lower bound theorems of Section 1.2 by exhibiting reductions from suitable problems. We first consider the problem of estimating the $k$th-smallest element.



Proof of Theorem 1.5: We need only prove the bound when $\Delta < n/4$, since it holds trivially otherwise. We assume that $\Delta$ is integral. The same proof works with $\lceil \Delta \rceil$ substituted for $\Delta$ for general $\Delta$.

Note that the query complexity of computing $f_{\ell,\ell'}$ is the same as that of computing $f_{n-\ell,n-\ell'}$, since we may negate the oracle responses in an algorithm for the former to get an algorithm for the latter, and vice versa. We now consider two cases:

Case (a). $2\Delta < k < n - 2\Delta$. Any algorithm for computing a $\Delta$-approximate $k$th-smallest element clearly also computes $f_{n-k+\Delta,n-k-\Delta}$, and hence, by Lemma 2.7 and the observation above, makes at least $\Omega(\sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta)$ queries.

Case (b). $k \le 2\Delta$ or $k \ge n - 2\Delta$. If $k \le 2\Delta$, we reduce from the function $f_{n,n-k-\Delta}$ to our problem. Lemma 2.8 along with the observation above now gives us the required bound. Similarly, for $k \ge n - 2\Delta$, we reduce from $f_{n-k+\Delta,0}$ and get the bound.

This completes the proof of the theorem. ■

Since the problem of approximating the median is really a special case of the more general problem of estimating the $k$th-smallest element, we get a lower bound for this problem as well.

Proof of Corollary 1.6: For $n$ odd, an $\epsilon$-approximate median is a $\Delta$-approximate $k$th-smallest element for $k = (n+1)/2$, and $\Delta = \lceil (\epsilon n + 1)/2 \rceil$. The lower bound of $\Omega(1/\epsilon)$ now follows from Theorem 1.5. ■
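The substitution in this proof can be followed numerically: plugging $k = (n+1)/2$ and $\Delta = \lceil(\epsilon n + 1)/2\rceil$ into the expression of Theorem 1.5 gives a quantity of order $1/\epsilon$. The sketch below only evaluates that expression (the constant hidden by $\Omega$ is not modelled), and the function names are hypothetical.

```python
import math


def kth_bound_expression(n, k, delta):
    """The expression sqrt(n / Delta) + sqrt(k (n - k)) / Delta from Theorem 1.5."""
    return math.sqrt(n / delta) + math.sqrt(k * (n - k)) / delta


def median_bound_expression(n, eps):
    """Theorem 1.5's expression at the parameters used for the median (n odd)."""
    k = (n + 1) // 2
    delta = math.ceil((eps * n + 1) / 2)
    return kth_bound_expression(n, k, delta)
```

The second term dominates: $\sqrt{k(n-k)}$ is about $n/2$ and $\Delta$ is about $\epsilon n/2$, so their ratio is about $1/\epsilon$.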

The lower bounds for estimating the median and the $k$th-smallest element continue to hold in the comparison tree model, since any comparison between two input numbers (which is made by querying a comparison oracle in this model) can be simulated by making at most 4 queries to an oracle of the sort we consider above.



The proofs of the lower bounds for approximate counting are similar to that of Theorem 1.5 above; we only
sketch them here.



Proof of Theorem 1.11: We may assume that $\Delta < n/6$, since the lower bound is trivial otherwise.
Consider any algorithm that approximately counts to within an additive error of $\Delta$. Fix any $0 \le t \le n$.
Suppose for any input $X$ with $|X| = t$, the algorithm outputs a $\Delta$-approximate count after $T$ queries with
probability at least 2/3. We then consider the truncated version of the algorithm which stops after making $T$
queries and outputs 1 if the approximate count obtained (if any) lies in the range $(t - \Delta, t + \Delta)$ and 0
otherwise. Since the original algorithm approximates to within $\Delta$ for all inputs, the truncated algorithm
computes $f_{t,\,t+\lceil 2\Delta\rceil}$ and/or $f_{t,\,t-\lceil 2\Delta\rceil}$ whenever these partial functions are well-defined (i.e., when $t + 2\Delta \le n$
and/or $t - 2\Delta \ge 0$). Now, by considering the four cases $t < 4\Delta$, $n - t < 4\Delta$, $4\Delta \le t \le n/2$ and $n/2 < t \le n - 4\Delta$
separately, and reducing from a suitable partial function (out of $f_{t,\,t+\lceil 2\Delta\rceil}$ and $f_{t,\,t-\lceil 2\Delta\rceil}$) in each of
these cases, we get the claimed lower bound. ■
Since the problem of approximate counting is a restriction of the more general problem of estimating the
mean of $n$ numbers, the lower bound for the latter problem follows directly from Theorem 1.11.



Proof of Corollary 1.12: If the input numbers are all 0/1, multiplying an $\epsilon$-approximate mean by $n$
gives us an $\epsilon n$-approximate count. From Theorem 1.11, we get that in the worst case (i.e., when the
number of ones in the input is $\lfloor n/2 \rfloor$), the number of queries required to solve the approximate mean
problem is $\Omega(1/\epsilon)$. ■
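The rescaling step used in this proof can be checked on a small instance. The following is only an illustrative sketch: the function name `count_from_mean` and the particular numbers are assumptions made for the example and do not come from the paper.

```python
from fractions import Fraction

def count_from_mean(approx_mean, n):
    """Scale an approximate mean of n 0/1 values into an approximate count."""
    return approx_mean * n

# Hypothetical instance: n = 20 bits, 7 of them ones, error parameter eps = 1/10.
n, ones, eps = 20, 7, Fraction(1, 10)
approx_mean = Fraction(ones, n) + Fraction(1, 20)  # within eps of the true mean
approx_count = count_from_mean(approx_mean, n)     # within eps * n of the true count
```

An additive error of $\epsilon$ in the mean becomes an additive error of $\epsilon n$ in the count, which is all the reduction needs.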

Finally, we sketch the proof of the lower bound for approximate counting to within some relative error. 



Proof of Theorem 1.13: To derive a lower bound for the number of queries $T$ made to approximate
the number of ones for $X$ such that $t_X = t$, we consider a truncated version of the algorithm obtained by
running the algorithm till it returns a value between $(1-\epsilon)t$ and $(1+\epsilon)t$ with probability at least 2/3 for
such inputs. Since the algorithm correctly approximates the count to within a relative error of $\epsilon$ for all
inputs, we can use it to compute the functions $f_{t,\,t+1}$, when $\epsilon t < 1/4$, and $f_{t',\,t}$, where $t' = \lfloor (1-\epsilon)t/(1+\epsilon) \rfloor$,
when $1/4 \le \epsilon t$. Corollary 1.2 now gives us the claimed bound. ■



3 Some optimal or essentially optimal algorithms 

We now show that the quantum black-box bounds obtained in the previous section are either tight or 
essentially tight by giving algorithms for the problems for which no such (optimal or near optimal) algorithm 
was known. 



3.1 An optimal distinguisher 



Recall the problem of computing the partial function $f_{\ell,\ell'}$ defined in Section 1.2. In this section, we show
how this partial function may be computed optimally, i.e., within a constant factor of the lower bound of
Corollary 1.2, thus proving Theorem 1.3. Along with Lemma 2.6, this implies that the polynomial degree
lower bound we show in Theorem 1.1 is within a constant factor of the optimal, and hence that it is not
possible to obtain better lower bounds for the problems we consider using our technique.

Our algorithm actually computes the partial function $f'_{\ell',\ell} : \{0,1\}^n \to \{0,1\}$, where $0 \le \ell' < \ell \le n$, defined
as:

\[ f'_{\ell',\ell}(X) = \begin{cases} 1 & \text{if } |X| \ge \ell \\ 0 & \text{if } |X| \le \ell' \end{cases} \]

Clearly, any algorithm for this partial function also computes $f_{\ell,\ell'}$, and thus the lower bound for the latter
also holds for this function. (To compute $f_{\ell,\ell'}$ when $\ell < \ell'$, it suffices to compute $f'_{\ell,\ell'}$ and negate the output.)
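The relation between the exact-weight function and its threshold version can be made concrete with a classical sketch. The names `f_exact`, `f_threshold` and `f_exact_via_threshold` are hypothetical, and `None` stands for "undefined" on inputs outside the promise; nothing here models the quantum algorithm itself.

```python
def f_exact(x, l, l_prime):
    """Partial function: 1 if |X| = l, 0 if |X| = l', undefined (None) otherwise."""
    weight = sum(x)
    if weight == l:
        return 1
    if weight == l_prime:
        return 0
    return None

def f_threshold(x, lo, hi):
    """Threshold version: 1 if |X| >= hi, 0 if |X| <= lo, None in the gap (lo < hi)."""
    weight = sum(x)
    if weight >= hi:
        return 1
    if weight <= lo:
        return 0
    return None

def f_exact_via_threshold(x, l, l_prime):
    """Compute f_exact through f_threshold, negating the output when l < l'."""
    if l_prime < l:
        return f_threshold(x, l_prime, l)
    out = f_threshold(x, l, l_prime)
    return None if out is None else 1 - out
```

On every input where `f_exact` is defined, `f_exact_via_threshold` returns the same value, which is why a lower bound for the exact-weight function carries over to the threshold function.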

The algorithm $D(X, \ell', \ell)$ for $f'_{\ell',\ell}$, which we call a distinguisher, is, in fact, an immediate derivative of an
approximate counting algorithm of Brassard et al. & [14], ||, which enables us to estimate the number of
ones $t_Y$ of a boolean function $Y$ in a useful manner.

Theorem 3.1 (Brassard, Høyer, Mosca, Tapp) There is a quantum black-box algorithm $C(Y, P)$ that,
given oracle access to a boolean function $Y = (y_0, \ldots, y_{n-1})$, and an explicit integer parameter $P$, makes $P$
calls to the oracle $Y$ and computes a number $t \in [0, n]$ such that

\[ |t_Y - t| < \frac{\sqrt{t_Y(n - t_Y)}}{P} + \frac{|n - 2t_Y|}{4P^2} \]

with probability at least 2/3.

Let $X$ be the input to the distinguisher $D$, and let $m$ and $\Delta_\ell$ be defined as in Section 1.2. Further,
let

\[ P = c\left(\sqrt{n/\Delta_\ell} + \sqrt{m(n-m)}/\Delta_\ell\right), \]

where $c$ is a constant to be determined later, and let $t = C(X, P)$.
The algorithm $D(X, \ell', \ell)$ returns 0 if $t < \ell' + \Delta_\ell/2$ and 1 otherwise. The correctness of the algorithm
follows from the claim below; its optimality is clear from the choice of $P$.
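The decision rule of the distinguisher is simple enough to write down classically. The sketch below assumes the estimate $t$ has already been produced by some counting routine. The default `c=8.0` is a placeholder rather than the constant fixed in the proof, and the names `query_budget` and `distinguish` are ours.

```python
import math

def query_budget(n, l_prime, l, c=8.0):
    """P = c * (sqrt(n/Delta) + sqrt(m(n-m))/Delta), with m the endpoint farthest from n/2."""
    delta = l - l_prime
    m = max((l, l_prime), key=lambda v: abs(n / 2 - v))
    return c * (math.sqrt(n / delta) + math.sqrt(m * (n - m)) / delta)

def distinguish(t_estimate, l_prime, l):
    """Return 0 if the estimate falls below the midpoint l' + Delta/2, and 1 otherwise."""
    delta = l - l_prime
    return 0 if t_estimate < l_prime + delta / 2 else 1
```

For example, with $\ell' = 10$ and $\ell = 20$ the midpoint is 15, so an estimate of 12 yields 0 and an estimate of 15 yields 1.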



Claim 3.2 With probability at least 2/3, if $t_X \le \ell'$, then $t < \ell' + \Delta_\ell/2$, and if $t_X \ge \ell$, then $t \ge \ell' + \Delta_\ell/2$.

We give the proof of this claim in Appendix B. We will see in the next section that this distinguishing
capability of $D$ also allows us to search for an element of a desired rank nearly optimally.



3.2 Approximating the kth-smallest element

Consider the problem of approximating the $k$th-smallest element in the black-box model. Recall that
when provided with a list $X = (x_0, \ldots, x_{n-1})$ of numbers as an oracle, and an explicit parameter $\Delta > 1/2$,
the task is to find an input number $x_i$ (or the corresponding index $i$) such that $x_i$ is a $j$th-smallest
element for a $j \in (k - \Delta, k + \Delta)$. Notice that we may round $\Delta$ to $\lceil\Delta\rceil$ without changing the function to
be computed. We therefore assume that $\Delta$ is an integer in the sequel.

The description of the function to be computed in terms of ranks of numbers in the input list needs to be
given carefully, since there may be repetition of numbers in the list. To accommodate repetitions, we let
rank$(x_i)$ denote the set of positions $j \in \{1, \ldots, n\}$ at which $x_i$ could occur when the list $X$ is arranged in
non-decreasing order. A $\Delta$-approximate $k$th-smallest element is thus a number $x_i$ such that rank$(x_i) \cap
(k - \Delta, k + \Delta)$ is non-empty.
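To make the rank-set convention concrete, here is a small classical sketch. The helper names `rank_set` and `is_approx_kth_smallest` are hypothetical, $\Delta$ is taken to be an integer, and the window is read as the open interval $(k - \Delta, k + \Delta)$ used in the task statement above.

```python
def rank_set(xs, i):
    """Positions (1-based) that xs[i] can occupy when xs is sorted in non-decreasing order."""
    smaller = sum(1 for v in xs if v < xs[i])
    equal = sum(1 for v in xs if v == xs[i])
    return range(smaller + 1, smaller + equal + 1)

def is_approx_kth_smallest(xs, i, k, delta):
    """True if some rank of xs[i] lies strictly between k - delta and k + delta."""
    return any(k - delta < r < k + delta for r in rank_set(xs, i))

# Hypothetical list with a repeated value: the three 3s share ranks 1, 2 and 3.
xs = [5, 3, 3, 9, 3]
```

With repetitions, every copy of a value gets the same set of ranks, so any copy of 3 in the list above counts as a valid answer for $k = 2$, $\Delta = 1$.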

In this section we give a near optimal quantum black-box algorithm for computing a $\Delta$-approximate $k$th-smallest
element. No non-trivial algorithm was known for this problem for general $k$. Our algorithm
is inspired by the minimum finding algorithm of Dürr and Høyer ||, and builds upon the general search
algorithm of Boyer et al. 0] and the distinguisher of the last section obtained from the approximate counting
algorithm of Brassard et al. [14], ||. To compute an $\epsilon$-approximate median within the bound stated in
Corollary 1.8, one only has to run this algorithm with the parameters $k$ and $\Delta$ chosen appropriately.



An abstract algorithm 

We first present the skeleton of our algorithm using two hypothetical procedures $S(\cdot,\cdot)$ and $K(\cdot)$. For
convenience, we define $x_{-1} = -\infty$, and $x_n = \infty$. The procedure $S(i,j)$ returns an index chosen uniformly
at random from the set of indices $l$ such that $x_i < x_l < x_j$, if such an index exists. The procedure $K(i)$
returns 'yes' when $x_i$ is a $\Delta$-approximate $k$th-smallest element of $X$, '<' if $x_i$ has rank at most $k - \Delta$
(i.e., rank$(x_i) \cap (k - \Delta, n] = \emptyset$) and '>' if $x_i$ has rank at least $k + \Delta$ (i.e., rank$(x_i) \cap [1, k + \Delta) = \emptyset$).
Our algorithm, which we refer to as $A(S, K)$, performs a binary search on the list of input numbers with
a random pivot using $S$ and $K$. It thus has the following form:

1. $i \leftarrow -1$, $j \leftarrow n$.

2. $l \leftarrow S(i,j)$.

3. If $K(l)$ returns 'yes', output $x_l$ (and/or $l$) and stop.
Else, if $K(l)$ returns '<', $i \leftarrow l$, go to step 2.
Else, if $K(l)$ returns '>', $j \leftarrow l$, go to step 2.

Call an execution of steps 2 and 3 a stage. This algorithm always terminates and produces a correct
solution within $n - 2\Delta + 2$ stages. However, the following lemma tells us that the expected number of
stages before termination is small. Let $N = \sqrt{n/\Delta} + \sqrt{k(n-k)}/\Delta$.

Lemma 3.3 The algorithm $A(S,K)$ terminates with success after an expected $O(\log N)$ number of stages.

We defer the proof of this lemma to Appendix B. Note that the lemma guarantees that, with probability
at least 3/4, the algorithm $A(S,K)$ terminates within $O(\log N)$ stages.
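The skeleton $A(S, K)$ can be simulated classically to see the random-pivot search at work. In this sketch, $S$ and $K$ are exact classical stand-ins for the quantum subroutines (so each call reads the whole list), $\Delta$ is an integer, and all function names are ours.

```python
import random

def rank_interval(xs, idx):
    """Smallest and largest 1-based rank that xs[idx] can take in sorted order."""
    smaller = sum(1 for v in xs if v < xs[idx])
    equal = sum(1 for v in xs if v == xs[idx])
    return smaller + 1, smaller + equal

def K(xs, idx, k, delta):
    """Exact classifier: '<' if all ranks <= k - delta, '>' if all ranks >= k + delta."""
    lo, hi = rank_interval(xs, idx)
    if hi <= k - delta:
        return '<'
    if lo >= k + delta:
        return '>'
    return 'yes'

def S(xs, i, j, rng):
    """Uniformly random index l with x_i < x_l < x_j, using x_{-1} = -inf and x_n = +inf."""
    low = float('-inf') if i == -1 else xs[i]
    high = float('inf') if j == len(xs) else xs[j]
    candidates = [l for l, v in enumerate(xs) if low < v < high]
    return rng.choice(candidates) if candidates else None

def approx_kth_smallest(xs, k, delta, rng=None):
    """Run the random-pivot search; return (index found, number of stages used)."""
    rng = rng or random.Random()
    i, j, stages = -1, len(xs), 0
    while True:
        l = S(xs, i, j, rng)
        if l is None:
            raise ValueError("no candidate pivot left")
        stages += 1
        verdict = K(xs, l, k, delta)
        if verdict == 'yes':
            return l, stages
        if verdict == '<':
            i = l
        else:
            j = l
```

Calling `approx_kth_smallest(xs, k, delta)` on a shuffled list typically finishes after a handful of stages, in line with the logarithmic expectation of Lemma 3.3.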

We now consider the behaviour of the algorithm $A$ when the (deterministic) procedure $K(\cdot)$ is replaced by
a randomized subroutine $K'(\cdot)$ with the following specification. On input $i$ (for some $0 \le i < n$):

• if $x_i$ is a $\frac{\Delta}{2}$-approximate $k$th-smallest element, output 'yes';

• else, if rank$(x_i)$ is at most $k - \Delta$, output '<';

• else, if rank$(x_i)$ is at least $k + \Delta$, output '>';

• else, if rank$(x_i)$ is at least $k - \Delta + 1$ and at most $k - \Delta/2$, probabilistically output either 'yes' or '<';

• else, if rank$(x_i)$ is at least $k + \Delta/2$ and at most $k + \Delta - 1$, probabilistically output either 'yes' or '>'.

The algorithm $A(S, K')$ obtained by replacing the subroutine $K(\cdot)$ by $K'(\cdot)$ clearly also always computes a
correct solution. Although it may require more iterations of steps 2 and 3 to arrive at a solution, we show
that the increase is by at most a constant factor.

Lemma 3.4 Let $X$ be any input oracle. The expected number of stages of the algorithm $A(S, K')$ with
oracle $X$ and parameter $\Delta$ is at most the expected number of stages of $A(S,K)$ on inputs $X$ and $\Delta/2$.



Appendix B contains a proof of this lemma. In light of Lemma 3.3, this implies that $A(S, K')$ also
terminates after an expected $O(\log N)$ number of stages.

Finally, we analyse the behaviour of the algorithm $A(S, K')$ when the procedures $S$ and $K'$ are allowed to
either report failure or output an incorrect answer with some small probability. As mentioned above, we
may restrict the number of stages of the algorithm to $O(\log N)$ and yet achieve success with probability at
least 3/4. Now, if any of $S$ or $K'$ fails (or errs) with probability $o(1/\log N)$, the net probability of success
will still be at least, say, 2/3.



A realization of the algorithm 

We are now ready to spell out the implementation of the two procedures S and K' out of which the 
algorithm is built. 

The subroutine S is derived from the generalized search algorithm of Boyer et al. [|j , which enables us to 
sample uniformly from the set of ones of a boolean function. 
Theorem 3.5 (Boyer, Brassard, Høyer, Tapp) There is a quantum black-box algorithm with access to
a boolean oracle $Y = (y_0, \ldots, y_{n-1})$ that makes $O(\sqrt{n/t})$ queries and returns an index $i$ chosen uniformly
at random from the set $\{j : y_j = 1\}$ with probability at least 2/3 if $|Y| \ge t$.

Note that the success probability of the procedure described above may be amplified to $1 - 2^{-\Omega(T)}$ by
repeating it at most $O(T)$ times, and returning a sample as soon as a 'one' of $Y$ is obtained. It can easily
be verified that a sample so generated has the uniform distribution over the ones of $Y$. The procedure $S(i,j)$
is implemented by defining a boolean function $Y = (y_0, \ldots, y_{n-1})$ by $y_l = 1$ if and only if $x_i < x_l < x_j$,
and using the above sampling procedure. Every time $S$ is invoked in $A$, there are at least $\Omega(\Delta)$ ones
in $Y$, and hence this implementation meets the targeted specification if the parameter $t$ in Theorem 3.5
is chosen to be $\Omega(\Delta)$, and the number of repetitions $T$ of the sampler is chosen to be $O(\log\log N)$. Each
"query" to the function $Y$ requires two queries to the input oracle $X$. Our sampling procedure thus
makes $O(\sqrt{n/\Delta}\,\log\log N)$ queries and succeeds with probability $1 - o(1/\log N)$.
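The effect of repetition on the failure probability can be illustrated numerically. This is a sketch under the assumption that the repetitions fail independently, each with probability at most `single_failure`; the helper name `repetitions_needed` is ours.

```python
def repetitions_needed(single_failure, target_failure):
    """Smallest T such that single_failure ** T <= target_failure."""
    t = 1
    while single_failure ** t > target_failure:
        t += 1
    return t
```

With a per-run failure probability of 1/3, five repetitions already push the failure probability below 1/100, and the number of repetitions grows only logarithmically in the inverse of the target.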

The subroutine $K'(i)$ is implemented by using the distinguisher $D$ of Section 3.1 to detect whether $x_i$ has
rank that is "far" from $k$ or not, by looking at both the number of elements smaller and the number of
elements larger than it. The probability of correctness of $D$ may be boosted to $1 - 2^{-\Omega(T)}$ by repeating
the algorithm $O(T)$ times, and returning the majority of the answers so obtained. We require that the
probability of error of our implementation be $o(1/\log N)$, so we take $T$ to be $\Theta(\log\log N)$. The detailed
description of the implementation follows:

1. If $k + \Delta - 1 > n$, go to step 2, otherwise continue. Let $t_0 = \lceil k + \Delta/2 \rceil - 2$, and $t_1 = k + \Delta - 1$.
Note that $0 \le t_0 < t_1 \le n$, since $k, \Delta \ge 1$. Define a boolean function $Y$ over a domain of size $n$,
with $y_j = 1$ if and only if $x_j < x_i$. If the distinguisher $D(Y, t_0, t_1)$ returns '0', go to step 2. Else,
output '>'.

2. If $k - \Delta < 0$, return 'yes', otherwise continue. Let $t_0 = n - \lfloor k - \Delta/2 \rfloor - 1$, and $t_1 = n - k + \Delta$. Note
that we again have $0 \le t_0 < t_1 \le n$. Define a boolean function $Y$ over a domain of size $n$, with $y_j = 1$
if and only if $x_j > x_i$. If the distinguisher $D(Y, t_0, t_1)$ returns '0', output 'yes'. Else, output '<'.

It is easy to verify that this meets the specification for $K'$ with probability $1 - o(1/\log N)$, and that it
makes $O(N \log\log N)$ queries to the oracle $X$.
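The threshold pairs passed to the distinguisher in the two steps can be tabulated with a short sketch. The function name `k_prime_thresholds` is hypothetical, $\Delta$ is assumed to be a positive integer, and `None` marks a step that the implementation skips.

```python
import math

def k_prime_thresholds(n, k, delta):
    """Return ((t0, t1) for step 1, (t0, t1) for step 2); None where the step is skipped."""
    step1 = None
    if k + delta - 1 <= n:  # step 1 applies only if the upper threshold fits in [0, n]
        step1 = (math.ceil(k + delta / 2) - 2, k + delta - 1)
    step2 = None
    if k - delta >= 0:      # step 2 returns 'yes' immediately when k - delta < 0
        step2 = (n - math.floor(k - delta / 2) - 1, n - k + delta)
    return step1, step2
```

For $n = 100$, $k = 50$ and $\Delta = 10$ this gives the pairs $(53, 59)$ and $(54, 60)$, and in every case the pair satisfies $0 \le t_0 < t_1 \le n$ as the text notes.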

By Lemma ^J, we conclude that the total number of queries made to the oracle is $O(N \log(N) \log\log N)$,
as claimed in Theorem 1.7. Observe that our implementation of $S$ and $K'$ uses only comparisons between
the input numbers, and thus may be adapted to work in the comparison tree model as well, with the same
bound on the number of oracle queries.



3.3 Optimal approximate counting 



Recall from Section 1.2 that the problem of computing a $\Delta$-approximate count consists of computing a
number in $[0, n]$ which is within an additive error of $\Delta$ from the number of ones $t_X$ of a given boolean
oracle input $X = (x_0, \ldots, x_{n-1})$.

The algorithm we propose is entirely analogous to the exact counting algorithm of Brassard et al. [||, 6],
and we give only a sketch of it here. The algorithm consists of first invoking the procedure $C(X, P)$ of
Theorem 3.1 a few times (say, five times), with $P = c\sqrt{n/\Delta}$ (for some suitable constant $c$), and getting
an estimate $t$ by taking the median of the approximate counts returned by $C$. With high (constant)
probability, this estimate is within $O(\min\{t_X, n - t_X\} + \Delta)$ of the actual count $t_X$. The algorithm then
invokes $C$ again, with $P = c_1(\sqrt{n/\Delta} + \sqrt{t(n-t)}/\Delta)$ (for a suitable constant $c_1$) and outputs the value
returned by $C$. It is easy to verify that with high (constant) probability, the approximate count obtained
is within the required range. An analysis similar to that of the exact counting algorithm mentioned above
yields the bound of Theorem 1.10 on the expected number of queries made by the algorithm.
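The query budgets of the two passes, and the median step between them, can be sketched as follows. The constants `c=4.0` and `c1=4.0` are placeholders (the text only asks for suitable constants), rounding up to an integer is our choice, and the function names are hypothetical.

```python
import math

def rough_pass_budget(n, delta, c=4.0):
    """Budget P = c * sqrt(n / Delta) for each of the first few invocations of C."""
    return math.ceil(c * math.sqrt(n / delta))

def refined_pass_budget(n, delta, t_rough, c1=4.0):
    """Budget P = c1 * (sqrt(n/Delta) + sqrt(t(n-t))/Delta) for the final invocation."""
    spread = max(t_rough * (n - t_rough), 0.0)
    return math.ceil(c1 * (math.sqrt(n / delta) + math.sqrt(spread) / delta))

def median(values):
    """Middle element of an odd-length list of estimates."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]
```

When the rough estimate is near 0 or $n$, the refined budget collapses to the rough one, and it is largest when the estimate is near $n/2$.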
A Some properties of polynomials



In this section, we present some properties of polynomials and define some concepts that we will use for 
our results. 

The symmetrization $p^{\mathrm{sym}}$ of a multivariate polynomial $p(x_0, \ldots, x_{n-1})$ is defined to be

\[ p^{\mathrm{sym}}(x_0, \ldots, x_{n-1}) = \frac{\sum_{\pi \in S_n} p(x_{\pi(0)}, \ldots, x_{\pi(n-1)})}{n!}, \]

where $S_n$ is the set of permutations on $n$ symbols.

If $p$ is a multilinear polynomial of degree $d$, then $p^{\mathrm{sym}}$ is also a multilinear polynomial of degree $d$.
Clearly, $p^{\mathrm{sym}}$ is a symmetric function. The following fact attributed to Minsky and Papert [13] says
that there is a succinct representation for $p^{\mathrm{sym}}$ as a univariate polynomial.



Fact A.1 If $p : \mathbb{R}^n \to \mathbb{R}$ is a multilinear polynomial of degree $d$, then there exists a polynomial $q : \mathbb{R} \to \mathbb{R}$,
of degree at most $d$, such that $q(x_0 + x_1 + \cdots + x_{n-1}) = p^{\mathrm{sym}}(x_0, \ldots, x_{n-1})$ for $x_i \in \{0, 1\}$.
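Fact A.1 can be checked by brute force on a tiny example. The polynomial `p` below is a hypothetical choice of ours (multilinear, degree 2, three variables), and `symmetrize` simply averages over all permutations as in the definition, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations, product

def symmetrize(p, n):
    """Return p_sym, the average of p over all permutations of its n arguments."""
    perms = list(permutations(range(n)))
    def p_sym(x):
        total = sum(Fraction(p(tuple(x[pi[i]] for i in range(n)))) for pi in perms)
        return total / len(perms)
    return p_sym

def p(x):
    """Hypothetical multilinear polynomial of degree 2: x0*x1 + 2*x2."""
    return x[0] * x[1] + 2 * x[2]

p_sym = symmetrize(p, 3)

# Group the values of p_sym on the boolean cube by Hamming weight.
values_by_weight = {}
for x in product((0, 1), repeat=3):
    values_by_weight.setdefault(sum(x), set()).add(p_sym(x))
```

Each weight class carries a single value, and those values agree with the univariate polynomial $q(s) = s(s-1)/6 + 2s/3$, which has degree 2 as the fact predicts.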



In the remainder of this section, we will deal only with univariate polynomials over the reals. 

The properties of polynomials that we use involve the concept of the uniform or Chebyshev norm of a
polynomial (denoted by $\|p\|$, for a polynomial $p$), which is defined as follows: $\|p\| = \max_{-1 \le x \le 1} |p(x)|$.
We will refer to the uniform norm of a polynomial as simply the norm of the polynomial.

The first property we require is a bound on the value of a polynomial in an interval, given a bound on its 
values at integer points in the interval. 

Fact A.2 Let $p$ be a polynomial of degree $d \le n$ such that $|p(i)| \le c$ for integers $i = 0, \ldots, n$. Then $|p(x)| \le
2^d \cdot c$ for all $x$ in the interval $[0, n]$.
This fact follows easily from an examination of the Lagrange interpolation for the polynomial $p$; the details
are omitted.

The next fact bounds the value of a polynomial outside the interval $[-1,1]$, in terms of its norm (i.e.,
its maximum value inside the interval $[-1,1]$). Let $T_d(x) = \frac{1}{2}\left[(x + \sqrt{x^2 - 1})^d + (x - \sqrt{x^2 - 1})^d\right]$. This
polynomial is known as the Chebyshev polynomial of degree $d$. Note that $|T_d|$ is an even function of $x$, and
that $|T_d(1 + x)| \le e^{2d\sqrt{2x + x^2}}$, for $x \ge 0$.
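The closed form of the Chebyshev polynomial and its growth just outside $[-1, 1]$ can be checked numerically. The sketch compares the standard three-term recurrence $T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x)$ with the closed form for $x \ge 1$, and reads the growth estimate above as $e^{2d\sqrt{2x + x^2}}$; the function names are ours.

```python
import math

def chebyshev(d, x):
    """T_d(x) via the recurrence T_0 = 1, T_1 = x, T_{k+1} = 2x T_k - T_{k-1}."""
    if d == 0:
        return 1.0
    prev, cur = 1.0, float(x)
    for _ in range(d - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

def chebyshev_closed_form(d, x):
    """Closed form (1/2)[(x + sqrt(x^2-1))^d + (x - sqrt(x^2-1))^d], used here for x >= 1."""
    r = math.sqrt(x * x - 1)
    return 0.5 * ((x + r) ** d + (x - r) ** d)
```

For instance, both functions give $T_2(2) = 7$, and the values at $1 + x$ stay below the exponential bound for the small $d$ and $x$ tried in the checks.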

Fact A.3 Let $p$ be a polynomial of degree at most $d$. Then, for $|x| > 1$,

\[ |p(x)| \le \|p\| \cdot |T_d(x)|. \]

A proof of this fact may be found in Section 2.7 of [17]. We require an easy corollary of this fact.

Corollary A.4 If $p$ is a polynomial of degree at most $d$ and $|p(x)| \le c$ for $|x| \le a$, for some $a > 0$, then

\[ |p(x)| \le c\,|T_d(x/a)| \]

for all $x$ with $|x| > a$.

At the heart of our lower bound proof is the following set of inequalities, due to Bernstein and Markov,
which relate the size of the derivative $p'$ of a polynomial $p$ to the degree of $p$. Proofs of these may be found
in Section 3.4 of [16] and Section 2.7 of [17].

Fact A.5 Let $p$ be a polynomial of degree $d$. Then, for $x \in [-1, 1]$,

1. (Markov) $|p'(x)| \le d^2\,\|p\|$;

2. (Bernstein) $\sqrt{1 - x^2}\,|p'(x)| \le d\,\|p\|$.

The next fact, which is a more general version of the Bernstein Inequality for algebraic polynomials, deals
with trigonometric polynomials. A trigonometric polynomial $t(x)$ of degree $d$ is a real linear combination
of the functions $\cos ix$ and $\sin ix$, where $i$ is an integer in the range $[0, d]$. For a trigonometric polynomial $t$,
we define its norm to be $\|t\| = \max_{-\pi \le x \le \pi} |t(x)|$.

Fact A.6 Let $t$ be a trigonometric polynomial of degree $d$. Then, for $x \in [-\pi, \pi]$,

\[ |t'(x)| \le d\,\|t\|. \]



B Proofs of some claims made in Section 3



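Proof of Claim 3.2: Recall that $m \in \{\ell, \ell'\}$ is such that $|n/2 - m|$ is maximized, and that $\ell' < \ell$. We
prove the claim when $m \le n/2$; the analysis of the other case is symmetric and is omitted. If $m \le n/2$,
then $m = \ell'$. Theorem 3.1 says that with probability at least 2/3,

\[ |t_X - t| < \frac{\sqrt{t_X(n - t_X)}}{P} + \frac{|n - 2t_X|}{4P^2}. \]

Then, if $t_X \le \ell' = m \le n/2$, and if $c$ is large enough,

\[ |t - t_X| < \frac{\sqrt{mn}}{c\sqrt{mn/2}/\Delta_\ell} + \frac{n}{4(c^2 n/\Delta_\ell)} < \frac{\Delta_\ell}{2}. \]

So $t < t_X + \Delta_\ell/2 \le \ell' + \Delta_\ell/2$. At the same time, we also have $t \ge g(t_X)$, where $g(x)$ is the function

\[ g(x) = x - \frac{\sqrt{xn}}{P} - \frac{n}{4P^2}. \]

We show that $g$ is an increasing function of $x$ for $x \ge \ell$ and that $g(\ell) \ge \ell - \Delta_\ell/2 = \ell' + \Delta_\ell/2$, provided $c$
is chosen large enough.

The derivative of $g$,

\[ g'(x) = 1 - \frac{\sqrt{n}}{2P\sqrt{x}}, \]

is an increasing function of $x > 0$, and if $c$ is large enough,

\[ g'(\ell) \ge 1 - \frac{\sqrt{n}}{2c\sqrt{n/\Delta_\ell}\,\sqrt{\ell}} > 0, \]

since $\ell \ge \Delta_\ell$. So $g'(x) > 0$ for all $x \ge \ell$, and $g$ is increasing for such $x$. Moreover, if $c$ is large enough, we
have

1. $\frac{n}{4P^2} \le \frac{n}{4(c^2 n/\Delta_\ell)} \le \frac{\Delta_\ell}{4}$;

2. if $\ell' \ge \Delta_\ell$, then $\ell = \ell' + \Delta_\ell \le 2\ell'$, and $\frac{\sqrt{\ell n}}{P} \le \frac{\sqrt{2\ell' n}}{c\sqrt{\ell' n/2}/\Delta_\ell} \le \frac{\Delta_\ell}{4}$; and

3. if $\ell' < \Delta_\ell$, then $\ell < 2\Delta_\ell$, and $\frac{\sqrt{\ell n}}{P} < \frac{\sqrt{2\Delta_\ell n}}{c\sqrt{n/\Delta_\ell}} \le \frac{\Delta_\ell}{4}$.

It follows from the observations made above that

\[ g(\ell) = \ell - \frac{\sqrt{\ell n}}{P} - \frac{n}{4P^2} \ge \ell - \frac{\Delta_\ell}{2}, \]

and $t \ge g(t_X) \ge \ell - \Delta_\ell/2$ for all $X$ such that $t_X \ge \ell$.
This completes the proof of the claim. ■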
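The claim can also be sanity-checked numerically for small $n$. The sketch below plugs the error bound of Theorem 3.1 into the two conditions of the claim. The value $c = 8$ used in the checks is merely one constant that suffices in this experiment and is not the constant of the proof, and the function names are ours.

```python
import math

def budget(n, l_lo, l_hi, c):
    """P = c * (sqrt(n/Delta) + sqrt(m(n-m))/Delta), with m the endpoint farthest from n/2."""
    delta = l_hi - l_lo
    m = max((l_lo, l_hi), key=lambda v: abs(n / 2 - v))
    return c * (math.sqrt(n / delta) + math.sqrt(m * (n - m)) / delta)

def error_bound(n, t_x, p):
    """Right-hand side of the estimate in Theorem 3.1 for true count t_x and budget p."""
    return math.sqrt(t_x * (n - t_x)) / p + abs(n - 2 * t_x) / (4 * p * p)

def claim_holds(n, l_lo, l_hi, c):
    """Check both halves of the claim for every admissible true count."""
    delta, p = l_hi - l_lo, budget(n, l_lo, l_hi, c)
    threshold = l_lo + delta / 2
    low_ok = all(t + error_bound(n, t, p) < threshold for t in range(0, l_lo + 1))
    high_ok = all(t - error_bound(n, t, p) >= threshold for t in range(l_hi, n + 1))
    return low_ok and high_ok
```

Any estimate within the error bound of a true count $t_X \le \ell'$ then lies below the midpoint, and any estimate for $t_X \ge \ell$ lies at or above it.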



Proof of Lemma 3.3: We examine, for every number in the input list, the probability that it is ever
selected in step 2 of the algorithm. The expected number of stages is the sum of these probabilities; we
show that this sum is $O(\log N)$. We concentrate on the case when $\Delta \le k \le n - \Delta$. The analysis in the
other cases is similar.

Consider any arrangement of the numbers in the input list in sorted order. For $-1 \le i_0 < k < j_0 \le n$,
let $p(l, i_0, j_0)$ denote the probability that the index of the $l$th number in the sorted list is ever chosen in
step 2 of the algorithm after $i = i_0$ and $j = j_0$. We are interested in bounding $p(l, -1, n)$ for each $l$ in
the range $[0, k - \Delta] \cup [k + \Delta, n]$. (The sum of these probabilities for $l \in (k - \Delta, k + \Delta)$ is clearly 1.)
Suppose $l \le k - \Delta$. We get the following recurrence by considering the result of the first invocation of $S$
after $i = i_0$, $j = j_0$:

\[ p(l, i_0, j_0) \le \frac{1}{j_0 - i_0 - 1} \left[ 1 + \sum_{i_1 = i_0 + 1}^{l - 1} p(l, i_1, j_0) + \sum_{j_1 = k + \Delta}^{j_0 - 1} p(l, i_0, j_1) \right]. \]

(The inequality is due to the fact that there may be repetitions of numbers in the input list.) Furthermore,
$p(l, l - 1, k + \Delta) \le 1/(k + \Delta - l)$. By induction, we now get

\[ p(l, i_0, j_0) \le \frac{1}{k + \Delta - l} \]

for all $-1 \le i_0 < l \le k - \Delta$ and $k + \Delta \le j_0 \le n$. Similarly, when $l \ge k + \Delta$, we get

\[ p(l, i_0, j_0) \le \frac{1}{l + \Delta - k} \]

for all $-1 \le i_0 \le k - \Delta$ and $k + \Delta \le l < j_0 \le n$. The expected number of stages is thus bounded by

\[ \sum_{l = 1}^{k - \Delta} \frac{1}{k + \Delta - l} + \sum_{l = k + \Delta}^{n} \frac{1}{l + \Delta - k} + 1, \]

which is at most

\[ \ln \frac{(k + \Delta - 1)(n - k + \Delta)}{(2\Delta - 1)^2} + 1 \le \ln \frac{(2k)(2(n - k))}{\Delta^2} + 1 = O(\log N), \]

since $\Delta \le k$ and $\Delta \le n - k$, and $\Delta \ge 1$. This is the bound in the statement of the lemma.
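The final chain of inequalities is easy to verify numerically. The sketch below evaluates the two harmonic-type sums directly and compares them with the logarithmic bounds, for integer $\Delta$ with $\Delta \le k \le n - \Delta$. The function names are ours, and the sums are taken in the form displayed above, which is our reading of the bound.

```python
import math

def stage_bound(n, k, delta):
    """Sum over l <= k - delta of 1/(k + delta - l), plus sum over l >= k + delta, plus 1."""
    left = sum(1 / (k + delta - l) for l in range(1, k - delta + 1))
    right = sum(1 / (l + delta - k) for l in range(k + delta, n + 1))
    return left + right + 1

def log_bound(n, k, delta):
    """ln((k + delta - 1)(n - k + delta) / (2 delta - 1)^2) + 1."""
    return math.log((k + delta - 1) * (n - k + delta) / (2 * delta - 1) ** 2) + 1

def relaxed_bound(n, k, delta):
    """ln((2k)(2(n - k)) / delta^2) + 1, which is O(log N)."""
    return math.log((2 * k) * (2 * (n - k)) / delta ** 2) + 1
```

Each sum is a tail of the harmonic series starting at $1/(2\Delta)$, which is why comparing it with an integral of $1/r$ from $2\Delta - 1$ gives the logarithm.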



Proof of Lemma 3.4: Call a sequence of elements generated by some choice of random coin tosses
of the procedure $S$ in an execution of the algorithm $A(S, K)$ or $A(S, K')$ till termination, a run. We
compare runs of the algorithm $A(S, K')$ with parameter $\Delta$ with the runs of the algorithm $A(S, K)$ with
parameter $\Delta/2$. Observe that when we condition on a set of decisions $D$ of $K'$ for every input index, each
run of $A(S, K')$ is also a prefix of runs of $A(S, K)$, that the sum of the probabilities of the occurrence of
the runs of $A(S, K)$ of which a particular run of $A(S, K')$ is a prefix, is equal to the probability of the
occurrence of that run of $A(S, K')$, and, finally, that exactly one prefix of any run of $A(S, K)$ is consistent
with the set of decisions $D$ we condition on. A straightforward calculation of the expected length of a run
of $A(S, K')$ now gives us the required bound. ■